Parallelization Strategies for Distributed Non Negative Matrix Factorization

Authors

  • Ahmed Nagy
  • Massimo Coppola
  • Nicola Tonellotto
Abstract

Dimensionality reduction and clustering have been the subject of intense research efforts over the past few years [2]. They offer an approach to extracting knowledge from huge amounts of data. Although some of these techniques are effective at reducing data dimensionality, very few have focused on scaling to data sets that might not fit into memory. Non-negative matrix factorization (NMF) is one of the effective techniques that can be used for dimensionality reduction, missing data prediction, and clustering. NMF has been parallelized on both shared-memory and distributed-memory systems. Our contribution lies in reaching a higher level of parallelism by proposing a new block division technique on the Hadoop framework. Furthermore, we use the block-based technique to design an enhanced cascaded NMF [6]. We compare the two division techniques that we propose, block-based and cascaded over block-based, with the column-based technique that exists in the literature [12]. The block-based technique performs 18% faster than the column-based one and achieves a convergence value 23% higher than the cascaded technique.
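
As a rough illustration of how column-based and block-based division differ, the Python sketch below cuts a data matrix V into a grid of tiles and accumulates the product W^T V from per-tile partial results, in the spirit of a map step followed by a reduce step. This product appears in the common multiplicative-update formulation of NMF. The choice of that formulation, the function names (`split_blocks`, `map_block`, `wt_v_blockwise`, `wt_v_direct`), and the map/reduce arrangement are all assumptions made for illustration. The abstract does not specify them, and this is not the authors' Hadoop implementation.

```python
from collections import defaultdict


def split_blocks(V, row_parts, col_parts):
    """Cut V (a list of rows) into a row_parts x col_parts grid of tiles.

    Assumes row_parts <= number of rows and col_parts <= number of columns.
    Returns {(i, j): (row_offset, col_offset, tile)}.
    """
    m, n = len(V), len(V[0])
    row_edges = [m * i // row_parts for i in range(row_parts + 1)]
    col_edges = [n * j // col_parts for j in range(col_parts + 1)]
    blocks = {}
    for i in range(row_parts):
        for j in range(col_parts):
            r0, r1 = row_edges[i], row_edges[i + 1]
            c0, c1 = col_edges[j], col_edges[j + 1]
            tile = [row[c0:c1] for row in V[r0:r1]]
            blocks[(i, j)] = (r0, c0, tile)
    return blocks


def map_block(W, r0, c0, tile):
    """Hypothetical 'map' step: partial W^T V for one tile.

    Uses only the rows of W that match the tile's rows and emits
    (global_column_index, partial_column) pairs.
    """
    k = len(W[0])
    out = []
    for local_c in range(len(tile[0])):
        partial = [
            sum(W[r0 + r][a] * tile[r][local_c] for r in range(len(tile)))
            for a in range(k)
        ]
        out.append((c0 + local_c, partial))
    return out


def wt_v_blockwise(W, V, row_parts, col_parts):
    """Hypothetical 'reduce' step: sum the partial columns into W^T V (k x n)."""
    k, n = len(W[0]), len(V[0])
    acc = defaultdict(lambda: [0.0] * k)
    for r0, c0, tile in split_blocks(V, row_parts, col_parts).values():
        for col, partial in map_block(W, r0, c0, tile):
            acc[col] = [x + y for x, y in zip(acc[col], partial)]
    return [[acc[c][a] for c in range(n)] for a in range(k)]


def wt_v_direct(W, V):
    """Reference W^T V computed without any partitioning."""
    m, n, k = len(V), len(V[0]), len(W[0])
    return [
        [sum(W[r][a] * V[r][c] for r in range(m)) for c in range(n)]
        for a in range(k)
    ]
```

Calling `wt_v_blockwise(W, V, 1, p)` corresponds to a column-based division into p strips, while `wt_v_blockwise(W, V, q, p)` with q > 1 yields q times as many independent tiles at the cost of an extra summation across row blocks. Both should agree with `wt_v_direct(W, V)`.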

Similar resources

Iterative Weighted Non-smooth Non-negative Matrix Factorization for Face Recognition

Non-negative Matrix Factorization (NMF) is a part-based image representation method. It comes from the intuitive idea that an entire face image can be constructed by combining several parts. In this paper, we propose a framework for face recognition by finding localized, part-based representations, denoted “Iterative weighted non-smooth non-negative matrix factorization” (IWNS-NMF). A new cost fun...

A new approach for building recommender system using non negative matrix factorization method

Nonnegative Matrix Factorization is a new approach to reduce data dimensions. In this method, by applying the nonnegativity of the matrix data, the matrix is decomposed into components that are more interrelated and divide the data into sections where the data in these sections have a specific relationship. In this paper, we use the nonnegative matrix factorization to decompose the user ratin...

Nonnegative Matrix Factorization: Algorithms and Parallelization

An alternative to singular value decomposition (SVD) in information retrieval is the low-rank approximation of an original non-negative matrix A by its non-negative factors U and V. The columns of U are the feature vectors with no negative components, and the columns of V store the non-negative weights that serve for the combination of feature vectors. First experiments show that restr...

Experiments with Cholesky Factorization on Clusters of SMPs

Cholesky factorization of large dense matrices is an integral part of many applications in science and engineering. In this paper we report on experiments with different parallel versions of Cholesky factorization on modern high-performance computing architectures. For the parallelization of Cholesky factorization we utilized various standard linear algebra software packages and present perform...

Voice-based Age and Gender Recognition using Training Generative Sparse Model

Abstract: Gender recognition and age detection are important problems in telephone speech processing to investigate the identity of an individual using voice characteristics. In this paper a new gender and age recognition system is introduced based on generative incoherent models learned using sparse non-negative matrix factorization and atom correction post-processing method. Similar to genera...

Publication date: 2012